Performance Characteristics of OpenMP Language Constructs on a Many-core-on-a-chip Architecture
Abstract
Emerging many-core-on-a-chip architectures offer massive on-chip parallelism through hardware support for multithreading. Suitable programming models are needed to develop, quickly, parallel applications that exploit this intra-chip parallelism and achieve high sustained performance. OpenMP, the de facto industry standard for writing parallel programs on shared-memory systems, could become a reasonable candidate. To improve our understanding of the behavior and performance characteristics of OpenMP programs on many-core-on-a-chip architectures, this paper presents a performance study of basic OpenMP language constructs on the IBM Cyclops64 (C64) architecture, which consists of 160 hardware thread units on a single chip. Compared with previous work on conventional SMP systems [1], the overhead of OpenMP language constructs on the C64 many-core architecture is at least one order of magnitude lower.
Similar resources
Coprocessors: An Early Performance Comparison
The demand for more and more compute power is growing rapidly in many fields of research. Accelerators, like GPUs, are one way to fulfill these requirements, but they often require a laborious rewrite of the application using special programming paradigms like CUDA or OpenCL. The Intel® Xeon Phi™ coprocessor is based on the Intel® Many Integrated Core Architecture and can be programmed ...
Efficient Programming for Multicore Processor Heterogeneity: OpenMP versus OmpSs
ARM single-ISA heterogeneous multicore processors combine high-performance big cores with power-efficient small cores. They aim at achieving a suitable balance between performance and energy. However, a main challenge is to program such architectures so as to efficiently exploit their features. In this paper, we study the impact on performance and energy trade-offs of single-ISA architecture ac...
Cost-aware Topology Customization of Mesh-based Networks-on-Chip
Nowadays, the growing demand for supporting multiple applications leads to the integration of multiple IPs on a single chip. In fact, finding a truly scalable communication architecture will be a critical concern. To this end, the Networks-on-Chip (NoC) paradigm has emerged as a promising solution to on-chip communication challenges in silicon-based electronics. Many of today’s NoC architectures are based...
Explicit Vector Programming with OpenMP 4.0 SIMD Extensions
Modern CPU and GPU processors with on-die integration of SIMD execution units for achieving higher performance and power efficiency have posed challenges to use the underlying SIMD hardware (or VPUs, Vector Processing Unit) effectively. Wide vector registers and SIMD instructions –Single Instructions operating on Multiple Data elements packed in wide registers such as AltiVec [2], SSE, AVX[10] ...
OpenMP Implementation and Performance on Embedded Renesas M32R Chip Multiprocessor
CMP (Chip Multiprocessor) is a promising processor architecture, not only for high performance but also for reducing power and energy consumption in embedded applications. We have implemented an OpenMP compiler for an embedded Renesas M32R chip multiprocessor as a parallel programming environment. In this paper, we report the preliminary performance of OpenMP benchmarks, including scientific an...
Publication date: 2006